Learning Objectives

After completing this lesson, you’ll be able to:

Parallel Processing

Parallel processing improves performance on high-end machines by running multiple actions at once as a set of separate processes. A system with multiple processors, each with multiple cores, helps because the system can assign each process to a different core. With multiple processes running on multiple cores, the entire translation can run several times faster than on a single core.

Parallel Processing and FME

Parallel processing in FME covers not an entire workspace but only a specific subsection of a workspace. You control where parallel processing occurs through custom transformers, which are a natural way to subdivide a large translation.

Parallel processing runs on features grouped by attribute values, similar to Group-Based transformers. Custom transformers using parallel processing assign each group to a separate process.

Note

To process multiple workspaces in parallel, you would use multiple engines on FME Server.

Activating Parallel Processing

Each custom transformer has a set of parameters, located in the Navigator window, that relate explicitly to parallel processing. Here you can set the level of parallel processing and the Group By attribute that defines the parallel processing groups:

Parallel Processing setting in the Navigator

Parallel Processing Levels

The parameter that controls parallel processing allows different levels of effort to be applied. Each level specifies the number of processes to run simultaneously relative to the number of cores available:

Parameter        Processes     2 Cores       4 Cores       8 Cores
No parallelism   1 process     1 process     1 process     1 process
Minimal          Cores / 2     1 process     2 processes   4 processes
Moderate         Cores x 1     2 processes   4 processes   8 processes
Aggressive       Cores x 1.5   3 processes   6 processes   12 processes
Extreme          Cores x 2     4 processes   8 processes   16 processes

According to the table above, Minimal parallelism results in two simultaneous FME processes on a quad-core machine. Extreme parallelism results in eight (assuming the workspace has at least eight tasks to process simultaneously).

There is also a hard cap for each FME license level:

FME Edition            Process Cap    2 Cores           4 Cores           8 Cores
Professional Edition   8 processes    Max 4 processes   Max 8 processes   Max 8 processes
All Other Editions     16 processes   Max 4 processes   Max 8 processes   Max 16 processes

So, if you have a Professional Edition license, you can never have more than eight processes at one time, regardless of machine type and the parallelism parameter. A quad-core machine, by contrast, can never have more than eight processes under any license, since that is the maximum that Extreme parallel processing allows on four cores.
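The two tables above can be combined into a single calculation. The following Python sketch is illustrative only and is not part of FME: the function and dictionary names are hypothetical, and rounding down for odd core counts is an assumption, since the table lists only even core counts.

```python
# Multipliers taken from the parallel processing levels table.
# None marks the "No parallelism" level, which always uses one process.
LEVEL_MULTIPLIERS = {
    "No parallelism": None,
    "Minimal": 0.5,
    "Moderate": 1.0,
    "Aggressive": 1.5,
    "Extreme": 2.0,
}

# Hard caps taken from the license table.
PROFESSIONAL_CAP = 8
OTHER_EDITIONS_CAP = 16


def max_processes(level, cores, professional=True):
    """Return the maximum number of simultaneous processes (hypothetical helper)."""
    multiplier = LEVEL_MULTIPLIERS[level]
    if multiplier is None:
        return 1
    # Assumption: fractional results round down, but never below one process.
    requested = max(1, int(cores * multiplier))
    cap = PROFESSIONAL_CAP if professional else OTHER_EDITIONS_CAP
    return min(requested, cap)


if __name__ == "__main__":
    for level in LEVEL_MULTIPLIERS:
        print(level, [max_processes(level, cores) for cores in (2, 4, 8)])
```

With a Professional license, the Extreme level on eight cores is cut from 16 to 8 processes, while on four cores the level itself is the limiting factor.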

Note

Note that these numbers are the maximum number of processes running at any one time. It's possible to divide data into many more groups and process them separately, but they won't all run at the same time. For example, given a quad-core machine, Moderate processing, and 20 groups of features, there will be a total of 20 processes, but only four will run at any one time. When one of the four finishes, a new one starts.
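The queueing behaviour in this note can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of FME's actual scheduler: the function name and the per-group durations are hypothetical, and each group is assumed to start as soon as a slot becomes free.

```python
import heapq


def simulate(group_durations, max_concurrent):
    """Simulate groups queued behind a fixed number of process slots.

    Returns (processes_started, peak_running, finish_time).
    """
    running = []  # min-heap of finish times for the processes now running
    clock = 0.0
    started = 0
    peak = 0
    for duration in group_durations:
        if len(running) == max_concurrent:
            # Every slot is busy: wait until the earliest process finishes.
            clock = heapq.heappop(running)
        heapq.heappush(running, clock + duration)
        started += 1
        peak = max(peak, len(running))
    finish = max(running) if running else clock
    return started, peak, finish


if __name__ == "__main__":
    # 20 groups of equal (hypothetical) duration on a quad-core machine
    # with Moderate parallelism, which allows four processes at once.
    print(simulate([1.0] * 20, max_concurrent=4))
```

In this run, 20 processes are started in total, but no more than four are ever running together.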

Parallel Processing Groups

The Parallel Process By parameter requires the author to select an attribute and, as already covered, selecting attributes for a custom transformer parameter requires some consideration.

In particular, you cannot simply pick an attribute to use for this parameter’s value:

FME_PROCESS_GROUP_BY cannot be set to an attribute

Instead, you can publish this parameter (and one for Group By Mode) to make them available as one of the custom transformer parameters:

Publishing this parameter makes parallel processing mode available in custom transformers

Note

Parallel processing is most effective in two specific scenarios. The first scenario is a small number of groups, each with a large amount of processing; parallel processing is less effective when there are a large number of small groups. The second scenario is many small tasks that the workspace offloads elsewhere. For example, when the workspace passes features to a web service through an HTTPCaller transformer, it's most efficient to have FME fire off as many requests as it can, as quickly as possible.

Parallel Processing and Custom Transformer Design

The keys to parallel processing are designing a custom transformer as a standalone subsection and defining the processing groups.

In essence, the custom transformer splits features into groups, processes each as a separate action, and then combines them back into a single group on exit.

It’s essential to be aware that there can be no relationship between features in different groups because each group is processed independently. If features are related, and their results depend on each other, they must be in the same group.

The easiest way to think of this is that the custom transformer itself is a Group-Based transformer. Because it is a wrapper for all the individual FME transformers in the definition, it allows those transformers to operate in groups, even if they are Feature-Based.
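The split, process, and combine pattern described above can be sketched in plain Python. This is only an analogy and not how FME implements it: threads stand in for FME's separate processes, and the feature dictionaries, the `zone` attribute, and the per-group numbering step are all hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_groups(features, group_by):
    """Split features into groups keyed by the value of one attribute."""
    groups = defaultdict(list)
    for feature in features:
        groups[feature[group_by]].append(feature)
    return groups


def process_group(features):
    """Hypothetical group-based work: number each feature within its group.

    The function only ever sees one group, so it cannot use features
    from any other group.
    """
    return [dict(f, rank_in_group=i) for i, f in enumerate(features, start=1)]


def run_in_parallel(features, group_by, max_workers=4):
    """Process each group independently, then merge into one output list."""
    groups = split_into_groups(features, group_by)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(process_group, groups.values()))
    return [feature for group in results for feature in group]


if __name__ == "__main__":
    features = [
        {"id": 1, "zone": "A"},
        {"id": 2, "zone": "B"},
        {"id": 3, "zone": "A"},
    ]
    for feature in run_in_parallel(features, "zone"):
        print(feature)
```

Because the numbering restarts in every group, the result depends on which features share a group, which is why related features must be grouped together.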

Note

Learn more about Group-Based and Feature-Based transformers.

Creating Artificial Groups

Sometimes the incoming features are unrelated to one another and must be split into arbitrary groups for processing. In these cases, where no identifier exists to define groups, you can create one by generating an attribute with the ModuloCounter or RandomNumberGenerator transformer.

For example, the author of the workspace below has many address features from which to create labels. To speed the process, they activate parallel processing:

Using parallel processing to speed use of a custom transformer

They could group by PostalCode but choose not to, because every address falls within one of only two postal codes. Therefore, they create an artificial group using the ModuloCounter. Note that the Group By parameter in the custom transformer is set to the _modulo_count attribute.

In such a scenario, it’s best to create only one group per intended FME process. For example, on a quad-core (four-core) computer, with an FME Professional license and Extreme parallel processing, a maximum of eight processes at a time are permitted. Therefore - as shown here - the optimum number of groups to use is also eight.
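The effect of an artificial grouping attribute can be imitated in a few lines of Python. This is a sketch of the idea rather than the ModuloCounter's implementation: the function name and sample features are hypothetical, and the counter is assumed to cycle through 0 to group_count - 1.

```python
from collections import Counter


def add_modulo_count(features, group_count, attr="_modulo_count"):
    """Return copies of the features tagged with a repeating counter value."""
    return [
        dict(feature, **{attr: index % group_count})
        for index, feature in enumerate(features)
    ]


if __name__ == "__main__":
    # 20 hypothetical address features spread over eight groups,
    # one group per permitted process.
    addresses = [{"address_id": n} for n in range(20)]
    tagged = add_modulo_count(addresses, group_count=8)
    print(Counter(f["_modulo_count"] for f in tagged))
```

The counter spreads features almost evenly, so each of the eight groups receives either two or three of the 20 features.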

Note

See this blog article for more information about generating parallel processing groups, including some special techniques. It's especially useful for custom transformers with multiple input ports.